Suvrit Sra

92 publications

18 venues

H Index 36

Affiliation

Massachusetts Institute of Technology (MIT), Laboratory for Information and Decision Systems, Cambridge, MA, USA
Max Planck Institute for Biological Cybernetics, Tübingen, Germany
University of Texas at Austin, Department of Computer Sciences, Austin, TX, USA

Name Venue Year Citations
On the Training Instability of Shuffling SGD with Batch Normalization. ICML 2023 0
Global optimality for Euclidean CCCP under Riemannian convexity. ICML 2023 0
The Crucial Role of Normalization in Sharpness-Aware Minimization. NIPS/NeurIPS 2023 0
Transformers learn to implement preconditioned gradient descent for in-context learning. NIPS/NeurIPS 2023 0
Sign and Basis Invariant Networks for Spectral Graph Representation Learning. ICLR 2023 0
CCCP is Frank-Wolfe in disguise. NIPS/NeurIPS 2022 1
Understanding the unstable convergence of gradient descent. ICML 2022 16
Max-Margin Contrastive Learning. AAAI 2022 0
Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity. ICML 2022 0
Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective. ICML 2022 0
Efficient Sampling on Riemannian Manifolds via Langevin MCMC. NIPS/NeurIPS 2022 0
Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond. ICLR 2022 0
Understanding Riemannian Acceleration via a Proximal Extragradient Framework. COLT 2022 0
Can contrastive learning avoid shortcut solutions? NIPS/NeurIPS 2021 44
Three Operator Splitting with Subgradients, Stochastic Gradients, and Adaptive Learning Rates. NIPS/NeurIPS 2021 2
Provably Efficient Algorithms for Multi-Objective Competitive RL. ICML 2021 13
Three Operator Splitting with a Nonconvex Loss Function. ICML 2021 4
Online Learning in Unknown Markov Games. ICML 2021 30
Coping with Label Shift via Distributionally Robust Optimisation. ICLR 2021 0
Contrastive Learning with Hard Negative Samples. ICLR 2021 0
Open Problem: Can Single-Shuffle SGD be Better than Reshuffling SGD and GD? COLT 2021 0
SGD with shuffling: optimal rates without component convexity and large epoch requirements. NIPS/NeurIPS 2020 26
Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes. NIPS/NeurIPS 2020 21
Why are Adaptive Methods Good for Attention Models? NIPS/NeurIPS 2020 91
Strength from Weakness: Fast Learning Using Weak Supervision. ICML 2020 23
From Nesterov's Estimate Sequence to Riemannian Acceleration. COLT 2020 50
Geodesically-convex optimization for averaging partially observed covariance matrices. ACML 2020 2
Complexity of Finding Stationary Points of Nonconvex Nonsmooth Functions. ICML 2020 19
Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition. ICML 2020 62
Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity. ICLR 2020 0
Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator. ICML 2019 37
Flexible Modeling of Diversity with Strongly Log-Concave Distributions. NIPS/NeurIPS 2019 10
Escaping Saddle Points with Adaptive Gradient Methods. ICML 2019 59
Are deep ResNets provably better than linear predictors? NIPS/NeurIPS 2019 11
Random Shuffling Beats SGD after Finite Epochs. ICML 2019 0
Learning Determinantal Point Processes by Corrective Negative Sampling. AISTATS 2019 0
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity. NIPS/NeurIPS 2019 0
Direct Runge-Kutta Discretization Achieves Acceleration. NIPS/NeurIPS 2018 92
Exponentiated Strongly Rayleigh Distributions. NIPS/NeurIPS 2018 12
An Estimate Sequence for Geodesically Convex Optimization. COLT 2018 43
Non-Linear Temporal Subspace Representations for Activity Recognition. CVPR 2018 39
Modular Proximal Optimization for Multidimensional Total-Variation Regularization. JMLR 2018 0
A Generic Approach for Escaping Saddle points. AISTATS 2018 0
Elementary Symmetric Polynomials for Optimal Experimental Design. NIPS/NeurIPS 2017 18
Polynomial time algorithms for dual volume sampling. NIPS/NeurIPS 2017 31
Combinatorial Topic Models using Small-Variance Asymptotics. AISTATS 2017 0
Proximal Stochastic Methods for Nonsmooth Nonconvex Finite-Sum Optimization. NIPS/NeurIPS 2016 155
Kronecker Determinantal Point Processes. NIPS/NeurIPS 2016 26
Stochastic Variance Reduction for Nonconvex Optimization. ICML 2016 503
First-order Methods for Geodesically Convex Optimization. COLT 2016 205
Geometric Mean Metric Learning. ICML 2016 136
Riemannian SVRG: Fast Stochastic Optimization on Riemannian Manifolds. NIPS/NeurIPS 2016 169
Fast DPP Sampling for Nystrom with Application to Kernel Methods. ICML 2016 73
AdaDelay: Delay Adaptive Distributed Stochastic Optimization. AISTATS 2016 31
Fast Mixing Markov Chains for Strongly Rayleigh Measures, DPPs, and Constrained Sampling. NIPS/NeurIPS 2016 33
Gaussian quadrature for matrix inverse forms with applications. ICML 2016 0
Parallel and Distributed Block-Coordinate Frank-Wolfe Algorithms. ICML 2016 0
Efficient Sampling for k-Determinantal Point Processes. AISTATS 2016 0
Fixed-point algorithms for learning determinantal point processes. ICML 2015 44
Matrix Manifold Optimization for Gaussian Mixtures. NIPS/NeurIPS 2015 73
Data modeling with the elliptical gamma distribution. AISTATS 2015 6
On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants. NIPS/NeurIPS 2015 191
Large-scale randomized-coordinate descent methods with non-separable linear constraints. UAI 2015 0
Efficient Structured Matrix Rank Minimization. NIPS/NeurIPS 2014 19
Towards an optimal stochastic alternating direction method of multipliers. ICML 2014 59
Randomized Nonlinear Component Analysis. ICML 2014 165
Riemannian Sparse Coding for Positive Definite Matrices. ECCV 2014 55
Fast Newton methods for the group fused lasso. UAI 2014 18
Geometric optimisation on positive definite matrices for elliptically contoured distributions. NIPS/NeurIPS 2013 28
Jensen-Bregman LogDet Divergence with Application to Efficient Similarity Search for Covariance Matrices. TPAMI 2013 157
Reflection methods for user-friendly submodular optimization. NIPS/NeurIPS 2013 76
Fast projections onto mixed-norm balls with applications. DMKD 2012 28
Scalable nonconvex inexact proximal splitting. NIPS/NeurIPS 2012 64
A new metric on the manifold of kernel matrices with application to matrix geometric means. NIPS/NeurIPS 2012 133
Fast Newton-type Methods for Total Variation Regularization. ICML 2011 87
Fast Projections onto ℓ1,q-Norm Balls for Grouped Feature Selection. ECML/PKDD 2011 39
Generalized Dictionary Learning for Symmetric Positive Definite Matrices with Application to Nearest Neighbor Retrieval. ECML/PKDD 2011 50
Efficient similarity search for covariance matrices via the Jensen-Bregman LogDet Divergence. ICCV 2011 77
A scalable trust-region algorithm with application to mixed-norm regression. ICML 2010 41
Efficient filter flow for space-variant multiframe blind deconvolution. CVPR 2010 232
Convex Perturbations for Scalable Semidefinite Programming. AISTATS 2009 9
Workshop summary: Numerical mathematics in machine learning. ICML 2009 0
Block-Iterative Algorithms for Non-negative Matrix Approximation. ICDM 2008 6
Fast Newton-type Methods for the Least Squares Nonnegative Matrix Approximation Problem. SDM 2007 139
Information-theoretic metric learning. ICML 2007 0
Incremental Aspect Models for Mining Document Streams. ECML/PKDD 2006 18
Efficient Large Scale Linear Programming Support Vector Machines. ECML/PKDD 2006 19
Generalized Nonnegative Matrix Approximations with Bregman Divergences. NIPS/NeurIPS 2005 479
Clustering on the Unit Hypersphere using von Mises-Fisher Distributions. JMLR 2005 874
Triangle Fixing Algorithms for the Metric Nearness Problem. NIPS/NeurIPS 2004 19
Minimum Sum-Squared Residue Co-Clustering of Gene Expression Data. SDM 2004 327
Generative model-based clustering of directional data. KDD 2003 122